Claims

What is claimed is: 

1. A method for use in providing improved fault tolerance in a computing system comprising
at least one computing machine, the method comprising the steps of: 

executing a control program in conjunction with a fault tolerance software system 
running on the at least one computing machine; and 

initiating via the control program a test script program which sends one or more
requests to a monitored program, wherein the test script program processes corresponding responses
to the one or more requests, and generates at least one return value utilizable by the control program
to indicate a failure condition in the monitored program.

2. The method of claim 1 wherein the computing system is configured in accordance with 
a client-server architecture and the at least one computing machine comprises a server of the 
computing system. 

3. The method of claim 1 wherein the control program comprises a control thread of a failure
detection process associated with a failure detection component of the fault tolerance software 
system. 

4. The method of claim 1 wherein the control program comprises a thread of a failure 
detection process and the test script program comprises a process separate from the failure detection 
process. 

5. The method of claim 1 wherein the control program comprises a thread of a failure 
detection process and the test script program comprises a thread of the same failure detection 
process. 

6. The method of claim 1 wherein the test script program is implemented in an object-oriented
programming language such that one or more components of the test script program
comprise a base class from which one or more other components of the test script program are
generatable for use with the monitored program.

7. The method of claim 6 wherein the object-oriented programming language comprises the 
Java programming language. 

8. The method of claim 6 wherein the one or more components comprising the base class 
comprise one or more of an initialization component, an obtain requests component, and a request 
interruption component. 

9. The method of claim 8 wherein the one or more other components generatable from the 
base class comprise a request issuance component and a response verification component, both 
particular to the monitored program. 

10. The method of claim 9 wherein for each of the requests specified in the obtain requests 
component, that component creates corresponding ones of the request issuance component and the 
request interruption component, and wherein the request interruption component terminates the 
corresponding request if a response from the monitored program is not received within a designated 
period of time. 

11. The method of claim 1 wherein the control program initiates a persistent program, and
the persistent program periodically initiates the test script program. 

12. The method of claim 11 wherein the persistent program comprises at least one of a
thread and a process. 

13. The method of claim 11 wherein the persistent program receives the return value from
the test script program and delivers it to the control program. 

14. The method of claim 1 wherein the test script program comprises an interpreted script.

15. The method of claim 1 wherein the test script program comprises a native executable. 

16. The method of claim 1 wherein the test script program comprises byte code. 

17. An apparatus for use in providing improved fault tolerance in a computing system, the 
apparatus comprising: 

at least one computing machine having a processor and a memory, the processor 
being operatively coupled to the memory, wherein the processor is operative: (i) to execute a control 
program in conjunction with a fault tolerance software system running on the at least one computing 
machine, and (ii) to initiate via the control program a test script program which sends one or more 
requests to a monitored program, wherein the test script program processes corresponding responses 
to the one or more requests, and generates at least one return value utilizable by the control program 
to indicate a failure condition in the monitored program. 

18. A storage medium for storing program code for use in providing improved fault
tolerance in a computing system comprising at least one computing machine, wherein the program 
code when executed on the at least one computing machine performs the steps of: 

executing a control program in conjunction with a fault tolerance software system
running on the at least one computing machine; and 

initiating via the control program a test script program which sends one or more 
requests to a monitored program, wherein the test script program processes corresponding responses 
to the one or more requests, and generates at least one return value utilizable by the control program 
to indicate a failure condition in the monitored program. 
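
For illustration only, and not as part of the claims, the arrangement recited in claims 6 to 10 might be sketched in Java roughly as follows. All names here (TestScript, EchoTestScript, TestScriptDemo, run) are hypothetical, as is the convention that a return value of 0 means healthy and 1 means failure. The monitored program is stood in for by a plain function, and the request interruption component is approximated with a timed wait on a Future, which is one possible reading rather than the claimed design.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Base class holding the initialization, obtain-requests, and
// request-interruption logic (compare claim 8). Hypothetical sketch.
abstract class TestScript {
    private final long timeoutMillis;

    protected TestScript(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    protected void initialize() { }                       // initialization component
    protected abstract List<String> obtainRequests();     // obtain requests component

    // Components particular to the monitored program (compare claim 9).
    protected abstract String issueRequest(String request) throws Exception;
    protected abstract boolean verifyResponse(String request, String response);

    // Assumed convention: 0 = all responses verified, 1 = failure condition.
    public final int run() {
        initialize();
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            for (String request : obtainRequests()) {
                Future<String> pending = pool.submit(() -> issueRequest(request));
                try {
                    String response = pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
                    if (!verifyResponse(request, response)) {
                        return 1;
                    }
                } catch (TimeoutException e) {
                    pending.cancel(true);   // interrupt the request that did not answer in time
                    return 1;
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return 1;
                } catch (ExecutionException e) {
                    return 1;               // the request itself threw
                }
            }
            return 0;
        } finally {
            pool.shutdownNow();
        }
    }
}

// Subclass for one particular (here simulated) monitored program.
class EchoTestScript extends TestScript {
    private final Function<String, String> monitored;

    EchoTestScript(Function<String, String> monitored, long timeoutMillis) {
        super(timeoutMillis);
        this.monitored = monitored;
    }

    @Override protected List<String> obtainRequests() {
        return Arrays.asList("ping", "status");
    }

    @Override protected String issueRequest(String request) {
        return monitored.apply(request);
    }

    @Override protected boolean verifyResponse(String request, String response) {
        return ("ok:" + request).equals(response);
    }
}

class TestScriptDemo {
    // Simulates a hung monitored program; returns early if interrupted.
    static String hang(String request) {
        try {
            Thread.sleep(60_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "late";
    }

    public static void main(String[] args) {
        int healthy = new EchoTestScript(r -> "ok:" + r, 2000).run();
        int hung = new EchoTestScript(TestScriptDemo::hang, 100).run();
        System.out.println("healthy=" + healthy + " hung=" + hung); // healthy=0 hung=1
    }
}
```

In this sketch a control program would call run() (directly, or through a persistent program as in claims 11 to 13) and treat a nonzero result as an indication of a failure condition in the monitored program.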